[#2620] feat(spark-connector): support hive table format properties #2605

Merged (11 commits) on Mar 29, 2024

Contributor

@FANNG1 FANNG1 commented Mar 20, 2024

What changes were proposed in this pull request?

Support Hive table format properties in the Spark connector, covering statements such as:

CREATE TABLE xxx STORED AS PARQUET
CREATE TABLE xxx USING PARQUET
CREATE TABLE xxx ROW FORMAT SERDE xx STORED AS INPUTFORMAT xx OUTPUTFORMAT xx
CREATE TABLE xxx ROW FORMAT DELIMITED FIELDS TERMINATED BY xx

Why are the changes needed?

Fix: #2620

Does this PR introduce any user-facing change?

no

How was this patch tested?

Unit tests (UT) and integration tests (IT).

@FANNG1 FANNG1 force-pushed the properties branch 3 times, most recently from bc0e4cb to b485ac4 Compare March 21, 2024 03:32
@FANNG1 FANNG1 changed the title from "[SIP][Don't merge][spark-connector] support hive format properties" to "[#2620] feat(spark-connector): support hive table format properties" Mar 21, 2024
@FANNG1
Contributor Author

FANNG1 commented Mar 21, 2024

It's ready for review now. @jerryshao @qqqttt123 @mchades @yuqi1129 @diqiu50, please help review when you are free.

@FANNG1
Contributor Author

FANNG1 commented Mar 26, 2024

The PR now uses the properties defined in gravitino-bundled-catalog. @mchades @yuqi1129 @diqiu50, please help review again.

@FANNG1
Contributor Author

FANNG1 commented Mar 27, 2024

@yuqi1129 @mchades @diqiu50, are there any other comments? @jerryshao, do you have time to review?

@mchades
Contributor

mchades commented Mar 27, 2024

Overall LGTM

@FANNG1
Contributor Author

FANNG1 commented Mar 28, 2024

@jerryshao do you have time to review this PR?

Comment on lines +31 to +34
public static final String SPARK_HIVE_STORED_AS = "hive.stored-as";
public static final String SPARK_HIVE_INPUT_FORMAT = "input-format";
public static final String SPARK_HIVE_OUTPUT_FORMAT = "output-format";
public static final String SPARK_HIVE_SERDE_LIB = "serde-lib";
Contributor

Are these configurations defined by Spark or by you?

Contributor Author

@FANNG1 FANNG1 Mar 29, 2024

Spark uses raw strings such as hive.stored-as; I defined SPARK_HIVE_STORED_AS to refer to that key.
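
For readers following this thread, here is a minimal sketch of the idea: name Spark's raw property keys as constants, then translate them to catalog-side keys through one lookup table. The four Spark-side values are copied from the diff above. Everything else is an assumption for illustration: the class name, the `EXAMPLE_CATALOG_*` keys, and the `toCatalogProperties` helper are hypothetical, and the PR itself takes the catalog-side keys from the constants defined in gravitino-bundled-catalog.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; this is not the connector's actual converter class.
final class HiveFormatPropertySketch {
  // Raw keys Spark places in table properties (values taken from the diff under review).
  static final String SPARK_HIVE_STORED_AS = "hive.stored-as";
  static final String SPARK_HIVE_INPUT_FORMAT = "input-format";
  static final String SPARK_HIVE_OUTPUT_FORMAT = "output-format";
  static final String SPARK_HIVE_SERDE_LIB = "serde-lib";

  // ASSUMPTION: placeholder catalog-side keys, invented for this example.
  static final String EXAMPLE_CATALOG_FORMAT = "example.format";
  static final String EXAMPLE_CATALOG_INPUT_FORMAT = "example.input-format";
  static final String EXAMPLE_CATALOG_OUTPUT_FORMAT = "example.output-format";
  static final String EXAMPLE_CATALOG_SERDE_LIB = "example.serde-lib";

  private static final Map<String, String> SPARK_TO_CATALOG = new HashMap<>();

  static {
    SPARK_TO_CATALOG.put(SPARK_HIVE_STORED_AS, EXAMPLE_CATALOG_FORMAT);
    SPARK_TO_CATALOG.put(SPARK_HIVE_INPUT_FORMAT, EXAMPLE_CATALOG_INPUT_FORMAT);
    SPARK_TO_CATALOG.put(SPARK_HIVE_OUTPUT_FORMAT, EXAMPLE_CATALOG_OUTPUT_FORMAT);
    SPARK_TO_CATALOG.put(SPARK_HIVE_SERDE_LIB, EXAMPLE_CATALOG_SERDE_LIB);
  }

  // Renames known format keys and passes every other property through unchanged.
  static Map<String, String> toCatalogProperties(Map<String, String> sparkProperties) {
    Map<String, String> converted = new HashMap<>();
    for (Map.Entry<String, String> entry : sparkProperties.entrySet()) {
      String key = SPARK_TO_CATALOG.getOrDefault(entry.getKey(), entry.getKey());
      converted.put(key, entry.getValue());
    }
    return converted;
  }
}
```

Keeping the raw strings behind named constants means a key is spelled in exactly one place, which is the motivation given in the reply above.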

HivePropertiesConstants.GRAVITINO_HIVE_SERDE_LIB);

/**
* CREATE TABLE xxx STORED AS PARQUET will save "hive.stored.as" = "PARQUET" in property.

Contributor

Is it "hive.stored-as" or "hive.stored.as"?

Contributor Author

It is hive.stored-as; I will correct the comment.
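
The spelling matters because property keys are matched as exact strings. A small sketch, assuming the table properties are available as a plain `Map` (the class and method names are illustrative, not part of the PR):

```java
import java.util.Map;
import java.util.Optional;

// Illustrative only: a plain Map stands in for the table properties.
final class StoredAsKeySketch {
  // The key confirmed in this thread.
  static final String SPARK_HIVE_STORED_AS = "hive.stored-as";

  // Returns the storage format if it was recorded under the exact key, otherwise empty.
  static Optional<String> storedAs(Map<String, String> tableProperties) {
    return Optional.ofNullable(tableProperties.get(SPARK_HIVE_STORED_AS));
  }
}
```

A map that contains only the dotted spelling `hive.stored.as` yields an empty result here, so a reader who copies the key from a misspelled Javadoc would silently lose the format.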

@FANNG1
Contributor Author

FANNG1 commented Mar 29, 2024

@jerryshao, all comments are addressed. Please help review again.

@jerryshao
Contributor

LGTM. Thanks @FANNG1 for your work.

@jerryshao jerryshao merged commit c6f08c6 into apache:main Mar 29, 2024
19 checks passed
@Yangxuhao123
Contributor

Sorry, I have been busy with other things recently. Has this part of the work been completed?

Successfully merging this pull request may close these issues.

[Subtask] [spark-connector] support hive table format related properties
6 participants